Differentially Private Ordinary Least Squares

Author

  • Or Sheffet
Abstract

Linear regression is one of the most prevalent techniques in machine learning; however, it is also common to use linear regression for its explanatory capabilities rather than label prediction. Ordinary Least Squares (OLS) is often used in statistics to establish a correlation between an attribute (e.g. gender) and a label (e.g. income) in the presence of other (potentially correlated) features. OLS assumes a particular model that randomly generates the data, and derives t-values, representing the likelihood of each real value to be the true correlation. Using t-values, OLS can release a confidence interval, which is an interval on the reals that is likely to contain the true correlation; and when this interval does not intersect the origin, we can reject the null hypothesis as it is likely that the true correlation is non-zero. Our work aims at achieving similar guarantees on data under differentially private estimators. First, we show that for well-spread data, the Gaussian Johnson-Lindenstrauss Transform (JLT) gives a very good approximation of t-values; secondly, when JLT approximates Ridge regression (linear regression with ℓ2-regularization) we derive, under certain conditions, confidence intervals using the projected data; lastly, we derive, under different conditions, confidence intervals for the “Analyze Gauss” algorithm (Dwork et al., 2014).
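To make the quantities in the abstract concrete, the following is a minimal sketch of the classical, non-private OLS baseline: coefficient estimates, their standard errors, t-values, a confidence interval, and the null-hypothesis check. It is not the paper's private estimator. The function names, the use of NumPy, and the default critical value of 1.96 (a large-sample normal approximation rather than an exact t-distribution quantile) are assumptions made for illustration.

```python
import numpy as np


def ols_t_values(X, y):
    """Classical OLS for y ~ X @ beta.

    Returns (beta_hat, std_err, t_values). Assumes X has full column rank
    and more rows than columns.
    """
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    residuals = y - X @ beta_hat
    # Unbiased estimate of the noise variance, with n - p degrees of freedom.
    sigma2_hat = residuals @ residuals / (n - p)
    std_err = np.sqrt(sigma2_hat * np.diag(xtx_inv))
    return beta_hat, std_err, beta_hat / std_err


def confidence_interval(beta_j, se_j, crit=1.96):
    """Interval beta_j +/- crit * se_j.

    crit is an illustrative default; an exact interval would use the
    t-distribution quantile for n - p degrees of freedom.
    """
    return beta_j - crit * se_j, beta_j + crit * se_j


def rejects_null(interval):
    """True when the interval does not contain zero."""
    low, high = interval
    return low > 0 or high < 0


# Synthetic example: the second feature has a true coefficient of zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + rng.normal(size=500)

beta_hat, std_err, t_values = ols_t_values(X, y)
for j in range(X.shape[1]):
    ci = confidence_interval(beta_hat[j], std_err[j])
    print(j, round(float(t_values[j]), 2), rejects_null(ci))
```

A large absolute t-value corresponds to an interval that stays away from zero, which is the situation in which the null hypothesis is rejected.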

Similar resources

Differentially Private Ordinary Least Squares

More specifically, we use Theorem B.1 from (Sheffet, 2015), which states that given a matrix A all of whose singular values are greater than $T(\epsilon, \delta)$, where $T(\epsilon, \delta) = 2B\left(\sqrt{2r \ln(4/\delta)} + 2\ln(4/\delta)\right)$, publishing RA is $(\epsilon, \delta)$-differentially private for an r-row matrix R whose entries are i.i.d. normal Gaussians. Since we have that all of the singular values of A′ are greater than w (as spec...

Full text
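The excerpt above describes a precondition (every singular value of A exceeds a threshold) followed by publishing RA for a Gaussian matrix R with r rows. The sketch below shows only that control flow. The function name `publish_projection` is hypothetical, and the threshold is passed in by the caller instead of being computed here, since its exact expression and the bound B should be taken from the cited theorem. Passing the check in this code does not by itself establish a privacy guarantee.

```python
import numpy as np


def publish_projection(A, r, threshold, rng=None):
    """Return R @ A for an r-row Gaussian R if A passes the singular-value check.

    Hypothetical helper for illustration. `threshold` stands in for the
    theorem's bound on the singular values and must be supplied by the caller.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    if n < d:
        # With fewer rows than columns, A has zero singular values.
        raise ValueError("A needs at least as many rows as columns")
    singular_values = np.linalg.svd(A, compute_uv=False)
    if singular_values.min() <= threshold:
        raise ValueError("smallest singular value does not exceed the threshold")
    # R has r rows and one column per row of A; entries are i.i.d. N(0, 1).
    R = rng.standard_normal((r, n))
    return R @ A


# Example: a well-conditioned 4x4 matrix projected down to 3 rows.
A = 10.0 * np.eye(4)
projected = publish_projection(A, r=3, threshold=5.0, rng=np.random.default_rng(1))
print(projected.shape)
```

Raising an error when the check fails is a design choice of this sketch; the excerpt is truncated before it says how the source handles matrices that do not meet the condition.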

Differentially Private Ordinary Least Squares: $t$-Values, Confidence Intervals and Rejecting Null-Hypotheses

Linear regression is one of the most prevalent techniques in data analysis. Given a large collection of samples composed of features x and a label y, linear regression is used to find the best prediction of the label as a linear combination of the features. However, it is also common to use linear regression for its explanatory capabilities rather than label prediction. Ordinary Least Squares (...

Full text

HMOs and Patient Trust in Physicians:

Patients in health maintenance organizations (HMOs) often trust their physicians less than those with other forms of insurance coverage. Recent studies have reported a backlash of criticism against managed care, especially HMOs. In response to this, some have argued that managed care has become more responsive to patients’ wishes. We use data from the Community Tracking Study (CTS) Household Su...

Full text

Nearly Optimal Private LASSO

We present a nearly optimal differentially private version of the well-known LASSO estimator. Our algorithm provides privacy protection with respect to each training example. The excess risk of our algorithm, compared to the non-private version, is Õ(1/n), assuming all the input data has bounded ℓ∞ norm. This is the first differentially private algorithm that achieves such a bound without the p...

Full text

Modeling Market Shares of the Leading Personal Automobile Insurance Companies

Private passenger automobile insurance companies employ a range of strategies and tactics to achieve their growth and profitability objectives. Gains and losses in market share among insurers suggest a fair degree of rivalrous behavior; however, previous econometric analyses have not adequately addressed the sources of firm-level advantages. Although prior studies have tested hypotheses about d...

Full text

Publication date: 2017